Hyperspectral image compressed processing: Evolutionary multi-objective optimization sparse decomposition

In the compressed processing of hyperspectral images, orthogonal matching pursuit algorithm (OMP) can be used to obtain sparse decomposition results. Aimed at the time-complex and difficulty in applying real-time processing, an evolutionary multi-objective optimization sparse decomposition algorithm for hyperspectral images is proposed. Instead of using OMP for the matching process to search optimal atoms, the proposed algorithm explores the idea of reference point non-dominated sorting genetic algorithm (NSGA) to optimize the matching process of OMP. Take two objective function to establish the multi-objective sparse decomposition optimization model, including the largest inner product of matching atoms and image residuals, and the smallest correlation between atoms. Utilize NSGA-III with advantage of high accuracy to solve the optimization model, and the implementation process of NSGA-III-OMP is presented. The experimental results and analysis carried on four hyperspectral datasets demonstrate that, compared with the state-of-the-art algorithms, the proposed NSGA-III-OMP algorithm has effectively improved the sparse decomposition performance and efficiency, and can effectively solve the sparse decomposition optimization problem of hyperspectral images.


Introduction
Hyperspectral images (HSIs) contain rich spatial geometric information and spectral feature information, which are suitable for target detection and recognition, image classification and other fields [1][2][3]. The expansion of the application field proposes complete spatial resolution and spectral resolution, needs to reduce the dimensionality [4][5][6] of hyperspectral data. In the process of dimensionality reduction, how to ensure the integrity of information is a big problem. Furthormore, sharp increase in the amount of information, further brings challenges to data storage and transmission. The traditional Shannon sampling theorem is no longer suitable for large data volume hyperspectral image processing.
Compressed Sensing (CS) theory [7,8] was proposed in 2006, with the characteristics of directly collecting data information. The essence of compressed sensing theory is to extract as Using the Gabor function to generate a redundant dictionary for sparse representation of the signal can ensure the global information of the signal and the intensity of the signal change in any local time. The generation function of the Gabor dictionary is [20,21], Where, win n ð Þ ¼ e À pn 2 is the Gaussian window function, n = 0,1,. . ., N, N is the signal length, γ = (s,u,υ,ω) is time-frequency parameter. Let D = {g γ } γ2Γ denote the redundancy dictionary, in which g γ is the atom defined by the parameter group γ, Γ is the set of parameter groups. The time-frequency parameters are discretized according to the following methods: g ¼ a a 1 ; a 2 a a 1 Du; a 3 a À a 1 Du; a 4 Do ð Þ, whereas, a = 2, Δu = 1/2, Δυ = π, Δω = π/6, 0 < α 1 � log 2 N, 0 � a 2 � 2 À a 1 þ1 N, 0 � a 3 < 2 a 1 þ1 , 0 < α 4 � 12. The Gabor dictionary is highly redundant, and the number of atoms is N atom = 52(Nlog 2 N + N − 1).

Sparse decomposition of OMP
The basic idea of OMP for sparse decomposition is to find the optimal atoms from the redundant dictionary that can best match the image signal. The calculation process is as follows.
First, select the atom g γ0 that best matches the image signal f to be decomposed from the redundant dictionary, and meet the following conditions, Where, h•i represents the inner product. The image can be decomposed into two parts: the component on the best atom g γ0 and the residual, namely, Where, R 1 f is the residual error after matching the original image with the best atom. Continuously carry out the above decomposition process on the residual, Where, R k f; g g k D E represents the component of the image residual R k f on the corresponding atom g g k .
After K times decomposition, the image is decomposed into, Under the condition that the signal satisfies the limited length, the exponential decay kR k fk is becomes zero with the increase of k. A small number of atoms can represent the main components of the image, namely, Where, K << N.
Assume the optimal atom set is Θ K , the original signal could be represented by the K atoms, denoted as, Where, θ is the sparse coefficient vector. Utilize least square method to solve the sparse coefficient vector, The reconstructed signal could be expressed as, The decomposition steps of OMP can be summarized in Algorithm 1.
4: Update the optimal atoms set with g γk to obtain represents the pseudo-inverse of the matrix Θ k . 6: Compute k = k + 1. 7: end 8: Compute the reconstructed signalf according to Eq (9).

Sparse decomposition optimization model
Analyzing the sparse decomposition process of the OMP algorithm, we find that the selection of the optimal atom is completed by calculating the inner product of all the atoms in the redundant dictionary and the decomposition residuals. In order to improve the accuracy and efficiency of sparse decompositionprocess, this paper takes, 1) the maximum inner product of the matching atom and the image residual, and 2) the minimum correlation between the atoms, as the two optimization goal, to construct a multi-objective sparse decomposition optimization model.
According to Algorithm 1, after k − 1 sub-decomposition, the residual is R k−1 f, and the optimal atom set is Θ k−1 , then in the kth decomposition process, the first objective function that needs to be optimized is the maximum inner product of matching atoms and residuals. The first objective function is computed as, In order to ensure the orthogonality between atoms and make the decomposition process more accurate, the second objective function that needs to be optimized is the minimum correlation between atoms. The second objective function is computed as, In order to ensure the unity of the multi-objective solution process, taking the minimum of each objective function as the optimization objective. The multi-objective sparse decomposition optimization model is expressed as, Proposed NSGA-III-OMP algorithm The basic idea of NSGA-III-OMP algorithm is: in the iterative process of OMP sparse decomposition, the NSGA-III evolution process is used to replace the original atom matching process. The main advantage of the algorithm NSGA-III is that, by defining a hyperplane and multiple reference points, it selects the individuals which can best adapt the environment, in order to maintain the diversity of the population and ensure that decision makers can find the optimal solution. Firstly, we describe the NSGA-III algorithm in detail, including the initilization, selection, the crossover and mutation operation. Following that, the flow chart and implementation of NSGA-III-OMP is at the last.

NSGA-III algorithm
In view of the characteristics of the evolutionary algorithm to solve the multi-objective sparse decomposition model, there is no need to generate a Gabor dictionary in advance, and a realvalued encoding method is used to construct chromosomes, and each chromosome represents an atom. According to the generating formula of Gabor atom, the length of chromosome is D = 4, and its value is the vector represented by (s,u,υ,ω). Suppose the population size is M, the maximum evolutionary generation is G max , and the number of objective functions is P. Genetic operator is the key to ensure the optimization of chromosome update, usually including selection, crossover and mutation [22,23].
Initialization operation. The initial parent population X par 0 is generated in a random manner. The mth element in X par 0 is expressed in Eq (13), Where, (s,u,υ,ω) is the random number satisfying the range of specific parameter, m = 1,2,. . ., M. Selection operation. The selection mechanism of the NSGA-III algorithm [24] is different from the traditional genetic algorithm. The specific steps include: non-dominant sorting, define the reference point, normalize objective function, association operation and filter operation.
Before the first selection operation, crossover and mutation operations are performed on the initial parent population X par 0 to generate offspring population X spr 0 . Merge the parent population X par 0 and the offspring population X spr 0 to form a composite population Compute the fitness and take the top M members from the composite population X com 0 . The individuals of the composite population X com 0 are divided into L layers according to the non-dominated rules [19]. The first L − 1 layers of individuals is denoted as X sel and the number is |X sel |. If |X sel | = M, the next parent population is started with X par Gþ1 ¼ X sel , else we need to select other M − |X sel | members from the Lth layer.
2) Define the reference point. We employ an efficient technique that spaces points on a normalization hyperplane with an intercept of one on each axis and is equally inclined to all objective axes. The number of objective functions is P, the length of chromosome is D, then the number of reference points is J ¼ C D PþDÀ 1 , j = 1,2,. . .,J, and the reference points can be set on the normalized hyperplane of dimension P − 1. We believe that the optimal solutions are widely scattered across the complete normalized hyper-plane.
3) Normalize objective function. Calculate the ideal point in population X sel by recognizing the minimum value b min p for each objective function p = 1,2,. . .,P, and build the ideal point The intercept a P of the p-th objective axis and the linear hyper-plane can then be computed according to the reference [19], and the normalized objective functions can be represented as, Assume the structured reference points is H s , we utilize Eq (16) to map each reference point onto the normalized hyperplane, and save the reference points in H n .
4) Association operation. Connect the reference point and the origin of hyperplane to form a reference line, calculate the perpendicular distance from all individuals in X sel to the reference lines. Select the nearest reference line, and associate the individual with the corresponding reference point. Through association operation, we associate individuals with reference points. It may appear that a reference point is associated with one or more population individuals, or some reference points are not associated with any population individual. At this point, we need to perform filter operation. 5) Filter operation. Determine the number ρ j of first l − 1 layer individuals associated with each reference point j, identify the reference points set J min = {j:argmin j ρ j } having minimum ρ j . Randomly select a reference point from j 1 , if r j 1 ¼ 0 and j 1 is associated with multiple individuals in lth layer, then select the individual closest to the reference line j 1 O to enter the next generation. If r j 1 ¼ 0 and j 1 is not associated with multiple individuals in lth layer, then randomly select an individual to enter the next generation. Repeatedly select reference points from J min , until the individuals are supplemented [19] and the parent population X par Gþ1 operation is completed.
Crossover and mutation operation. Since we have already performed an elitist selection operator on the parent population, the population diversity could be maintained. Therefore, we apply usual crossover and mutation operators to create the offspring population X spr Gþ1 . According to the crossover probability, it is judged whether the individual has crossover and the gene position where the crossover occurs. The analog binary crossover operator is used to generate the progeny population. According to the mutation probability, determine whether the individual has mutation and determine the gene location where the mutation occurred, and adopt the basic mutation method to determine the individual value after mutation.
NSGA-III process. After continous iteration in NSGA-III evolution process, we obtain the optimal offspring solution set X spr G max . Compute the contribution degree of the individual objective function value in the total objective function value, the individual X optimal (optimal = 1,2,. . .,M) with the smallest value of f optimal is selected as the optimal atom. The smallest value of f optimal is computed as, Where, X m m ¼ 1; 2; . . .; M ð Þ 2 X spr G max . Combined with the description of the above operators, the implementation steps of NSGA-III is summarized in Algorithm 2.
Algorithm 2: NSGA-III Inputs: the evolution generation G max , the initial parent population X par 0 , the objective function f p , p = 1,2,. . .,P, the structured reference points H s . Output: the optimal solution X optimal . 1: Perform crossover and mutation operations to generate the initial offspring population X spr 0 , set iteration G = 0.
Points to be choosen from Lth layer: M − |X sel | 15: Normalize the objective functions using Eq (16) and save the reference points H n .

16:
Associate each member of X sel with a reference point.

17:
Excute the filter operator to obtain the parent population X par Gþ1 . 18: endif 19: Perform crossover and mutation operations to generate the offspring population X spr Gþ1 . 20: Compute G = G + 1 21: endWhile 22: Find the optimal solution X optimal using Eq (17) from X spr G max .

Implementation of NSGA-III-OMP
We use the NSGA-III algorithm to replace the atomic matching process in OMP, and propose the NSGA-III-OMP algorithm. The implementation of NSGA-III-OMP is summarized in Algorithm 3. According to Algorithm 2: NSGA-III, with the parameters M and G max , the optimal individual X optimal is obtained as the optimal atom g k . 3: Update the optimal atoms set with g γk to obtain represents the pseudoinverse of the matrix Θ k . 5: Compute k = k + 1. 6: end 7: Compute reconstructed signalf according to Eq (9).

Hyperspectral datasets
Four hyperspectral datasets [25,26] with diverse signatures are included in the investigations, allowing for a complete quantitative and quantitative comparative evaluation of the proposed scheme. The datasets are namely Cuprite1, Cuprite2, Indian Pines collected by AVIRIS and Pavia University collected by ROSIS. The water absorption and noisy bands in the original data set have been removed, and the image has been spatially cropped to the block size B = 16 for computation convenience. The basic conditions of the 4 sets of data are shown in S1 Table. The original image of the 50th band of hyperspectral data is shown in Fig 1.

Evaluation metrices
The performance of the algorithm is evaluated using the average peak signal-to-noise ratio (PSNR) [27], average structural similarity (SSIM) [28], average vector-based SNR [29] and average vector-based spectral-angle distortion (SAD) [29], running time and reconstructed image. The running software and hardware environment of the experiment is: AMD quadcore CPU, 3.80 GHz, 16 G memory, Matlab2012b.
In the next discussions, we use F = [F 1 , F 2 , . . .F l ,. . .F L ] 2 R N×L to represent the hyperspectral images, N is the pixels of each band image and L is the number of bands, F l is the vector representation of one band image, F[n] is the vector representation of one pixel.
The band-based PSNR measured in dB is defined as, Þ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffiffi Where, F l andF l are the original and reconstructed band images, max(F l ) is the peak value of is the mean squared error [27] and is computed as, The band-based SSIM between F l andF l is computed as, Where, μ 1 and μ 2 are the mean values of F l andF l , σ 1 and σ 2 are the standard deviation values of F l andF l , σ 12 represents the correlation coefficient between F l andF l , C 1 and C 2 are constants related to the dynamic range of the pixel values. The details for these parameters can refer to [28]. The average PSNR and average SSIM are calculated by averaging the band-based PSNR and band-based SSIM over all bands of the hyperspectral dataset.
Vector-based SNR measured in dB is defined as, The vector-based SAD measured in degree is presented as, The average SNR and average SAD are calculated by averaging the vector-based SNR and vector-based SAD over all spectral vectors in the hyperspectral dataset.

Parameter selection
The proposed algorithm NSGA-III-OMP needs to set the number of populations, the maximum evolutionary algebra and the number of decompositions (ie, the optimal number of atoms), and the effects of these parameters on the performance of the algorithm are studied separately.

Evolution parameter
Firstly, the algorithm NSGA-III-OMP is used to sparsely decompose the 40th band image of the 4 groups of hyperspectral data, and the influence of the maximum evolutionary generation, population size and decomposition times on the performance of the algorithm is analyzed. The variation range of population size M is 5~50, the interval is 5, the variation range of the maximum evolutionary generation G max is 5~50, the interval is 5, the variation range of the decomposition number K is 20~100, and the interval is 20. Fig 2 shows the variation of the average PSNR of the reconstructed image of Cuprite2 with the parameters. When the number of decompositions is 100, the average reconstructed PSNR varies with the maximum evolutionary generation and population size is shown in Fig 2(a). Under the same maximum evolutionary generation, the PSNR has a small oscillation with the increase of the number of populations. While under the small population number, with the increase of the evolutionary generation, the PSNR has a slow increase, but the overall change is not large.
When the population size is 10, the influence of the maximum evolutionary generation and decomposition times on the reconstructed PSNR is shown in Fig 2(b). When the maximum evolutionary generation is 5, the influence of population size and decomposition times on reconstruction performance is shown in Fig 2(c). From Fig 2(b) and 2(c), it can be clearly found that the influence of the number of decompositions on the reconstructed PSNR is much greater than the influence of the maximum evolutionary generation and the number of populations. Fig 2(d) shows the influence of the maximum evolutionary generation and population size on the reconstruction time when the decomposition number is 100. It is found that the maximum evolutionary generation has a greater impact on the reconstruction efficiency than the population size. The experimental results of the other three groups of hyperspectral images are similar to Cuprite2, considering the reconstruction accuracy and computational complexity, the number of selected population size in NSGA-III-OMP algorithm is M = 10, and the maximum evolutionary generation is G max = 5.

Decomposition number
The NSGA-III-OMP algorithm and the OMP algorithm are used to sparsely decompose the 40th band image of the hyperspectral data sets, and the algorithm is set to terminate when the number of decompositions reaches 150. Fig 3 shows the changes of the reconstructed PSNR obtained by the two algorithms. The thick line is the accuracy of the reconstructed image obtained by the 50 best atoms using OMP. The experimental results show that the NSGA-III-OMP algorithm cannot achieve the reconstruction accuracy of the OMP algorithm with only 50 atoms. This is because: in each decomposition process of the OMP algorithm, it can find the best matching atoms from the redundant dictionary to guarante the performance. Due to the randomness, the atom searched by NSGA algorithm may not be the atom that best matches the residual. Therefore, more atoms need to be found to fully characterize the original image characteristics and achieve the reconstruction accuracy of the OMP algorithm.
If the PSNR of the reconstructed image obtained by the OMP algorithm using 50 atoms is used as the standard, Cuprite1 and Cuprite2 need about 100 atoms to achieve this standard, while Indian Pines and Pavia University need about 150 atoms to achieve the accuracy of the OMP algorithm. Accordingly, the maximum decomposition number of the OMP algorithm is set to K = 50. For the 4 sets of hyperspectral data, the decomposition number for the algorithm NSGA-III-OMP is set to K = [100,100,150,150].

Results and analysis
Three decomposition techniques are compared in order to better understand the possibilities of the proposed NSGA-III-OMP. The first is the OMP algorithm, described in Algorithm 1. This approach searches the best atoms by traversing all the atoms in the redundant dictionary. The second is the particle swarm optimization based on OMP, denoted as PSO-OMP algorithm, described in literature [30]. The core idea of this algorithm is within the framework of OMP, achieving the best atoms by evolution of particle swarms. The third one is our proposed algorithm, presented in Algorithm 3. This algorithm is within the framework of OMP, exploring the genetic evolution to search the best atoms. Compared with PSO-OMP, except for the differences in evolutionary principles, our NSGA-III-OMP algorithm also optimizes a multiobjective model shown in subsection 2.3, while PSO-OMP solves a single objective model. In addition, the selection operation adopts a non-dominated sorting to maintain the diversity of population and ensure the accuracy of the optimal solution.
The decomposition number of OMP algorithm is K = 50. With reference [30], the population size of PSO-OMP algorithm is 10, the maximum evolutionary generation is 5, and the decomposition number is K = [100,100,150,150] for four datasets, respectively. The parameters of NSGA-III-OMP algorithm are: the population size is M = 10, the maximum evolution generation is G max = 5, and the decomposition number is K = [100,100,150,150] for four datasets.

Reconstruction performance comparison
The band-based PSNR of four datasets is shown in Fig 4. Seen from the figure, the results of PSO-OMP are almost the same as OMP for Cuprite1 and Cuprite2, shown in Fig 4(a) and 4  (b). We can clearly find that the proposed NSGA-III-OMP has obvious advantage over PSO-OMP and OMP. In regard to Indian Pines in Fig 4(c), we discover that for first half bands (maybe 1-100 bands), the NSGA-III-OMP is better than PSO-OMP, and OMP is worst result. Whereas, in the last bands (maybe 101-200 bands), the three algorithms obtain the very consistent accuracy. For the dataset Pavia University, we can clearly see the advantages of evolution algorithms, especially our NSGA-III-OMP has absolute advantage over the other two methods. The average PSNR listed in Table 1 can get the same conclusion as in Fig 4. The average PSNR are computed by the band-based PSNR of four datasets with three methods (see S1 Dataset). These results shows that our algorithm can better find the optimal atoms by solving a multi-objective model using non-dominated sorting selection operation.
The band-based SSIM of four datasets is shown in Fig 5. For Cuprite1 and Cuprite2 images, the SSIM of PSO-OMP is higher than OMP, which is different from the PSNR comparison in Fig 4(a) and 4(b). Especially, the results of our proposed algorithm NSGA-III-OMP are superior to the other two methods. The average SSIM are presented in Table 2. The average SSIM are computed by the band-based SSIM of four datasets with three methods (see S2 Dataset). This shows that our algorithm can recover the spatial structural features of the original image, which is a compelling proof of the proposed algorithm's usefulness.
The average SNR and average SAD are shown in Fig 6 and    The average SAD are computed by the vector-based SAD of four datasets with three methods (see S4 Dataset). The comparisons between the original spectral vector and the reconstructed spectral vectors are shown in Fig 8. In this figure, without loss of generality, we choose one spectral vector randomly from the datasets as a representive for comparison. The reconstructed spectral vectors are very consistent with the original spectral vectors, which is an addition proof of the algorithm's avalibility.
The following comparison with several other state-of-the-art algorithms is presented to further showcase the algorithm's performance. The algorithm 3D-DWT [31] uses three dimension wavelet transform for the sparse decomposition of hyperspectral images. The JOMP [31] computes the common support by sparse coding the vector consisting of patches of all bands, leading to high complexity. The FJOMP [32] finds the common support by sparse coding the

PLOS ONE
training band, improving the computational efficiency. Note that, the Cuptite1 image used in this compariosn is cut from the upper corner to be subimage of size 512×512 pixels for keeping in line with the literatures.

PLOS ONE
capabilities of orthogonal basis is not enough. For the first 12~20 bands, the PSNR of JOMP and FJOMP are higher than NSGA-III-OMP, while the two algorithms dropped significantly with the band number increasing. In contrast to our algorithm, the performance remains stable and obtains the highest average PSNR. Although the comparison is not detailed enough, the results can still reflect the effectiveness of our proposed algorithm.
In conclusion, numerous comparisons between the proposed NSGA-III-OMP and some state-of-the-art algorithms have demonstrated the reliability of the algorithm. In comparison with the OMP and PSO-OMP algorithms, the results and analysis fully indicate that it is feasible to use NSGA-III to solve the constructed multi-objective sparse decomposition optimization model. The optimal atoms searched by NSGA-III not only can represent the spatial features of the band images, but also describe the spectral features of spectral vectors. These results are able to prove the decomposition accuracy of NSGA-III-OMP.

Reconstruction efficiency comparison
The runtime of three decomposition algorithms is shown in Fig 10. From the histogram, the evolution algorithms, PSO-OMP and NSGA-III-OMP, can greatly improve the efficiency of optimal atoms matching. In terms of efficiency, NSGA-III-OMP has a slight advantage over PSO-OMP. Compared with OMP, the proposed NSGA-III-OMP can increase the calculation speed by an order of magnitude.

Reconstruction image comparison
As compared to other metrices, a viewer's preference for pleasant visual quality is more relevant. After the data set Cuprite1 is sparsely decomposed, the comparison between the reconstructed image and the original image is shown in Fig 11. The figure shows the original image and the 40th band of the reconstructed image. The reconstructed PSNR of OMP algorithm, The comparison between the reconstructed image and the original image of Pavia University is shown in Fig 12. The figure shows the original image and the 40th band of the reconstructed image. The reconstructed PSNR of OMP algorithm, PSO-OMP algorithm and NSGA-III-OMP algorithm can reach 39.66dB, 42.96dB and 46.38dB respectively.
Seen from these reconstruction images, the reconstructed image can well describe the detailed features of the original image, which fully demonstrates that the NSGA-III-OMP algorithm can find the optimal atoms by using the evolution process of the chromosome, complete the high-precision sparse decomposition of the image, and fully demonstrate the effectiveness of the algorithm.

Conclusions
Analyzing the characteristics of hyperspectral images, a sparse decomposition strategy based on NSGA-III-OMP is proposed. The algorithm utilizes the reference point non-dominated sorting genetic method to solve the constructed multi-objective sparse decomposition optimization model. The algorithm uses the chromosomes of the genetic algorithm to simulate the atom matching process in OMP, and explores the selection, crossover and mutation operators to search the optimal atom. The influence of the population size, the maximum evolutionary generation and the decomposition times on the performance of the algorithm is studied, and reasonable parameters are set in the following experiments. Performance comparisons between proposed NSGA-III-OMP algorithm and other algorithms demonstrate that, our proposed algorithm can effectively improve the reconstruction accuracy and computational efficiency of sparse decomposition.